Graph-Based Clustering for Computational Linguistics: A Survey

نویسندگان

  • Zheng Chen
  • Heng Ji
چکیده

In this survey we overview graph-based clustering and its applications in computational linguistics. We summarize graph-based clustering as a five-part story: hypothesis, modeling, measure, algorithm and evaluation. We then survey three typical NLP problems in which graph-based clustering approaches have been successfully applied. Finally, we comment on the strengths and weaknesses of graph-based clustering and envision that graph-based clustering is a promising solution for some emerging NLP problems.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Evolutionary Graph Clustering using Graph and Cluster Mixtures

Many networks and accordingly their representation in graphs are subject to structural changes during the course of their existence [CKT06]. Examples for such evolutionary networks include friendship networks in online communities, co-authorship networks in the scientific domain and collocation networks in computational linguistics. Studying the evolution of such networks can provide vital data...

متن کامل

A partition-based algorithm for clustering large-scale software systems

Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...

متن کامل

An Application of Operational Research to Computational Linguistics: Word Ambiguity

This paper draws on graph theory and optimization techniques to develop a new measure of word ambiguity (e.g., homonymy and polysemy) for use in psycholinguistic research. This measure provides information regarding the uncertainty of the intended meaning of English words. Specifically, data about sixty-four thousand distinct words was collected from a corpus of close to three hundred million w...

متن کامل

A Graph-Based Clustering Approach to Identify Cell Populations in Single-Cell RNA Sequencing Data

Introduction: The emergence of single-cell RNA-sequencing (scRNA-seq) technology has provided new information about the structure of cells, and provided data with very high resolution of the expression of different genes for each cell at a single time. One of the main uses of scRNA-seq is data clustering based on expressed genes, which sometimes leads to the detection of rare cell populations. ...

متن کامل

Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

In recent years, the tremendous and increasing growth of spatial trajectory data and the necessity of processing and extraction of useful information and meaningful patterns have led to the fact that many researchers have been attracted to the field of spatio-temporal trajectory clustering. The process and analysis of these trajectories have resulted in the extraction of useful information whic...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010